    Serum levels of mature microRNAs in DICER1-mutated pleuropulmonary blastoma.

    DICER1 is a critical gene in the biogenesis of mature microRNAs, short non-coding RNAs that derive from either -3p or -5p precursor microRNA strands. Germline mutations of DICER1 are associated with a range of human malignancies, including pleuropulmonary blastoma (PPB). Additional somatic 'hotspot' mutations in the microRNA processing ribonuclease IIIb (RNase IIIb) domain of DICER1 are reported in cancer; these affect microRNA biogenesis, resulting in a -3p mature microRNA strand bias. Here, in a germline (exon11 c.1806_1810insATTGA) DICER1-mutated PPB, we first confirmed the presence of an additional somatic RNase IIIb hotspot mutation (exon25 c.5425G>A [p.G1809R]) by conventional sequencing. Second, we investigated serum levels of mature microRNAs at the time of PPB diagnosis, and compared the findings with serum results from a comprehensive range of pediatric cancer patients and controls (n=52). We identified a panel of 45 microRNAs that were present at elevated levels in the serum at the time of PPB diagnosis, with a significant majority noted to be derived from the -3p strand (P=0.013). In addition, we identified a subset of 10 serum microRNAs (namely miR-125a-3p, miR-125b-2-3p, miR-380-5p, miR-125b-1-3p, let-7f-2-3p, let-7a-3p, let-7b-3p, miR-708-3p, miR-138-1-3p and miR-532-3p) that were most abundant in the PPB case. Serum levels of two representative microRNAs, miR-125a-3p and miR-125b-2-3p, were not elevated in DICER1 germline-mutated relatives. In the PPB case, serum levels of miR-125a-3p and miR-125b-2-3p increased before chemotherapy, and then showed an early reduction following treatment.
These microRNAs may offer future utility as serum biomarkers for screening patients with known germline DICER1 mutations for early detection of PPB, and for potential disease-monitoring in cases with confirmed PPB. We would like to thank the following for providing financial support: SPARKS (NC, MJM), Medical Research Council Fellowship (MJM), TD Bank/LDI scholarship (LdK), Alex’s Lemonade Stand Foundation (WDF), Cancer Research UK (NC) and European Research Council under the European Union’s Seventh Framework Programme (FP/2007-2013)/ERC Grant Agreement No. 310018 (MT). This is the final published version. It first appeared at http://www.nature.com/oncsis/journal/v3/n2/full/oncsis20141a.html

    Identification of Recurrent Mutations in the microRNA-Binding Sites of B-Cell Lymphoma-Associated Genes in Follicular Lymphoma

    Follicular lymphoma (FL) is a common indolent B-cell lymphoma that can transform into the more aggressive transformed FL (tFL). However, the molecular process driving this transformation is uncertain. In this work, we aimed to identify microRNA (miRNA)-binding sites recurrently mutated in FL patients, as well as in tFL patients. Using whole-genome sequencing data from FL tumors, we discovered 544 mutations located in bioinformatically predicted miRNA-binding sites. We then studied these specific regions using targeted sequencing in a cohort of 55 FL patients, found 16 recurrent mutations, and identified a further 69 variants. After quality-control filtering, we identified 21 genes with mutated miRNA-binding sites that were also enriched for B-cell-associated genes by Gene Ontology. Over 40% of mutations identified in these genes were present exclusively in tFL patients. We validated the predicted miRNA-binding sites of five of the genes by luciferase assay and demonstrated that the identified mutations in BCL2 and EZH2 genes impaired the binding efficiency of miR-5008 and miR-144 and regulated the endogenous levels of messenger RNA (mRNA).

    Detecting microRNA binding and siRNA off-target effects from expression data.

    Sylamer is a method for detecting microRNA target and small interfering RNA off-target signals in 3' untranslated regions from a ranked gene list, sorted from upregulated to downregulated, after a microRNA perturbation or RNA interference experiment. The output is a landscape plot that tracks occurrence biases using hypergeometric P-values for all words across the gene ranking. We demonstrated the utility, speed and accuracy of this approach on several datasets.
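
    The idea of tracking a hypergeometric P-value for a word along a gene ranking can be illustrated with a minimal Python sketch. This is not the Sylamer implementation: the function names, the toy sequences and the word are hypothetical, only enrichment (not depletion) is tested, and the sketch counts sequences that contain the word, which may differ from how Sylamer counts occurrences.

```python
from math import comb, log10

def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement
    from N items, K of which carry the word."""
    total = comb(N, n)
    upper = min(K, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def enrichment_landscape(ranked_utrs, word, step=1):
    """Return (cutoff, -log10 P) pairs for `word` over leading bins of the ranking.

    ranked_utrs: 3' UTR sequences ordered as in the ranked gene list
    (an assumption of this sketch: one sequence per gene).
    """
    hits = [word in utr for utr in ranked_utrs]
    N, K = len(hits), sum(hits)
    landscape = []
    for n in range(step, N + 1, step):
        k = sum(hits[:n])            # sequences with the word among the top n
        p = hypergeom_sf(k, N, K, n)
        landscape.append((n, -log10(p)))
    return landscape

# Toy ranking: the hypothetical word sits only in the first three sequences.
toy_utrs = ["AAGCACTTA", "GCACTTCC", "TTGCACTT", "ACGTACGT", "TTTTAAAA", "CCGGCCGG"]
toy_landscape = enrichment_landscape(toy_utrs, "GCACTT")
```

    Plotting the second element of each pair against the cutoff gives one curve of such a landscape; repeating the calculation for every word of a fixed length would give the full set of curves.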

    CODA: Accurate Detection of Functional Associations between Proteins in Eukaryotic Genomes Using Domain Fusion

    Background: In order to understand how biological systems function it is necessary to determine the interactions and associations between proteins. Gene fusion prediction is one approach to detection of such functional relationships. Its use, however, is known to be problematic in higher eukaryotic genomes due to the presence of large homologous domain families. Here we introduce CODA (Co-Occurrence of Domains Analysis), a method to predict functional associations based on the gene fusion idiom. Methodology/Principal Findings: We apply a novel scoring scheme which takes account of the genome-specific size of homologous domain families involved in fusion to improve accuracy in predicting functional associations. We show that CODA is able to accurately predict functional similarities in human in comparison with state-of-the-art methods and show that different methods can be complementary. CODA is used to produce evidence that a currently uncharacterised human protein may be involved in pathways related to depression and that another is involved in DNA replication. Conclusions/Significance: The relative performance of different gene fusion methodologies has not previously been explored. We find that they are largely complementary, with different methods being more or less appropriate in different genomes. Our method is the only one currently available for download and can be run on an arbitrary dataset by the user. The CODA software and datasets are freely available from ftp://ftp.biochem.ucl.ac.uk/pub/gene3d_data/v6.1.0/CODA/. Predictions are also available via web services from http://funcnet.eu/

    SCPS: a fast implementation of a spectral method for detecting protein families on a genome-wide scale

    Background: An important problem in genomics is the automatic inference of groups of homologous proteins from pairwise sequence similarities. Several approaches have been proposed for this task which are "local" in the sense that they assign a protein to a cluster based only on the distances between that protein and the other proteins in the set. It was shown recently that global methods such as spectral clustering have better performance on a wide variety of datasets. However, currently available implementations of spectral clustering methods mostly consist of a few loosely coupled Matlab scripts that assume a fair amount of familiarity with Matlab programming and hence they are inaccessible for large parts of the research community. Results: SCPS (Spectral Clustering of Protein Sequences) is an efficient and user-friendly implementation of a spectral method for inferring protein families. The method uses only pairwise sequence similarities, and is therefore practical when only sequence information is available. SCPS was tested on difficult sets of proteins whose relationships were extracted from the SCOP database, and its results were extensively compared with those obtained using other popular protein clustering algorithms such as TribeMCL, hierarchical clustering and connected component analysis. We show that SCPS is able to identify many of the family/superfamily relationships correctly and that the quality of the obtained clusters as indicated by their F-scores is consistently better than all the other methods we compared it with. We also demonstrate the scalability of SCPS by clustering the entire SCOP database (14,183 sequences) and the complete genome of the yeast Saccharomyces cerevisiae (6,690 sequences). Conclusions: Besides the spectral method, SCPS also implements connected component analysis and hierarchical clustering, it integrates TribeMCL, it provides different cluster quality tools, it can extract human-readable protein descriptions using GI numbers from NCBI, it interfaces with external tools such as BLAST and Cytoscape, and it can produce publication-quality graphical representations of the clusters obtained, thus constituting a comprehensive and effective tool for practical research in computational biology. Source code and precompiled executables for Windows, Linux and Mac OS X are freely available at http://www.paccanarolab.org/software/scps.
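
    As a point of reference for the baselines named in this abstract, connected component analysis on pairwise similarities can be sketched in a few lines of Python. This is a minimal illustration of that baseline only, not of the spectral method and not SCPS code: the function name, the similarity scores, the protein labels and the threshold are all hypothetical.

```python
def connected_components(similarities, threshold):
    """Group proteins by linking every pair whose similarity is >= threshold.

    similarities: dict mapping (protein_a, protein_b) -> score
    (a hypothetical input format chosen for this sketch).
    Returns a sorted list of sorted clusters.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarities.items():
        root_a, root_b = find(a), find(b)   # registers both proteins
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b         # merge the two components

    clusters = {}
    for protein in parent:
        clusters.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(members) for members in clusters.values())

# Toy similarities: P3 and P4 are only weakly similar, so two groups emerge.
toy_scores = {
    ("P1", "P2"): 0.9,
    ("P2", "P3"): 0.8,
    ("P3", "P4"): 0.1,
    ("P4", "P5"): 0.7,
}
toy_clusters = connected_components(toy_scores, threshold=0.5)
```

    Because a single above-threshold link is enough to merge two groups, this kind of clustering depends only on local pairwise evidence, which is the limitation that global methods such as spectral clustering are meant to address.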

    Fusion and Fission of Genes Define a Metric between Fungal Genomes

    Gene fusion and fission events are key mechanisms in the evolution of gene architecture, whose effects are visible in protein architecture when they occur in coding sequences. Until now, the detection of fusion and fission events has been performed at the level of protein sequences with a post facto removal of supernumerary links due to paralogy, and often did not include looking for events defined only in single genomes. We propose a method for the detection of these events, defined on groups of paralogs to compensate for the gene redundancy of eukaryotic genomes, and apply it to the proteomes of 12 fungal species. We collected an inventory of 1,680 elementary fusion and fission events. In half the cases, both composite and element genes are found in the same species. Per-species counts of events correlate with the species genome size, suggesting a random mechanism of occurrence. Some biological functions of the genes involved in fusion and fission events are slightly over- or under-represented. As already noted in previous studies, the genes involved in an event tend to belong to the same functional category. We inferred the position of each event in the evolutionary tree of the 12 fungal species. The event localization counts for all the segments of the tree provide a metric that depicts the “recombinational” phylogeny among fungi. A possible interpretation of this metric as distance in adaptation space is proposed.

    Ortho2ExpressMatrix—a web server that interprets cross-species gene expression data by gene family information

    Background: The study of gene families is pivotal for the understanding of gene evolution across different organisms and such phylogenetic background is often used to infer biochemical functions of genes. Modern high-throughput experiments offer the possibility to analyze the entire transcriptome of an organism; however, it is often difficult to deduce functional information from that data. Results: To improve functional interpretation of gene expression we introduce Ortho2ExpressMatrix, a novel tool that integrates complex gene family information, computed from sequence similarity, with comparative gene expression profiles of two pre-selected biological objects: gene families are displayed with two-dimensional matrices. Parameters of the tool are object type (two organisms, two individuals, two tissues, etc.), type of computational gene family inference, experimental meta-data, microarray platform, gene annotation level and genome build. Family information in Ortho2ExpressMatrix is based on computationally different protein family approaches such as EnsemblCompara, InParanoid, SYSTERS and Ensembl Family. Currently, respective all-against-all associations are available for five species: human, mouse, worm, fruit fly and yeast. Additionally, microRNA expression can be examined with respect to miRBase or TargetScan families. The visualization, which is typical for Ortho2ExpressMatrix, is rendered as a matrix view that displays functional traits of genes (differential expression) as well as sequence similarity of protein family members (BLAST e-values) in colour codes. Such translations are intended to facilitate the user's perception of the research object. Conclusions: Ortho2ExpressMatrix integrates gene family information with genome-wide expression data in order to enhance functional interpretation of high-throughput analyses on diseases, environmental factors, or genetic modification or compound treatment experiments. The tool explores differential gene expression in the light of orthology, paralogy and structure of gene families up to the point of ambiguity analyses. Results can be used for filtering and prioritization in functional genomic, biomedical and systems biology applications. The web server is freely accessible at http://bioinf-data.charite.de/o2em/cgi-bin/o2em.pl.

    MINE: Module Identification in Networks

    Background: Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. Results: MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. Conclusions: MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans.